Rank | Count | Beginning |
---|---|---|
14756 | 1751 | I |
5474 | 1629 | Det |
3544 | 1205 | Den |
12043 | 1103 | Han |
3070 | 799 | De |
8144 | 748 | En |
13951 | 689 | Hun |
2498 | 457 | Da |
9196 | 374 | Etter |
15578 | 350 | Ifølge |
22766 | 346 | På |
20082 | 316 | Men |
6875 | 245 | Dette |
10509 | 209 | For |
10169 | 193 | Flere |
4953 | 172 | Der |
13498 | 170 | Her |
9034 | 160 | Et |
27364 | 157 | Tidligere |
27549 | 144 | Til |
27813 | 144 | To |
20967 | 142 | Nå |
22083 | 135 | Også |
4264 | 131 | Denne |
19883 | 120 | Med |
28675 | 120 | Under |
25170 | 110 | Selv |
21996 | 108 | Og |
25465 | 100 | Siden |
5136 | 97 | Dermed |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
Especially for N=1, we only need a small corpus to identify the most frequent sentence beginnings.
SELECT SUBSTRING_INDEX(sentence, ' ', 1) AS beg, COUNT(*) AS cnt FROM sentences GROUP BY SUBSTRING_INDEX(sentence, ' ', 1) ORDER BY cnt DESC LIMIT 50;
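The same count can be reproduced outside the database, which may help when checking the query on a small sample. The following is a minimal sketch, not the method used to produce the table: the function name `sentence_beginnings`, its parameters, and the sample sentences are assumptions made for illustration. It takes the first N space-separated words of each sentence, mirroring what `substring_index(sentence, ' ', N)` returns, and counts them.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: splits on a single space and keeps the first
    n pieces, which mirrors substring_index(sentence, ' ', n).
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by count, descending (like ORDER BY cnt DESC)
    return counts.most_common(limit)


# Invented sample sentences, not taken from the corpus
sample = [
    "Det er kaldt i dag.",
    "Det regner ute.",
    "Han kom hjem sent.",
]

print(sentence_beginnings(sample, n=1))
print(sentence_beginnings(sample, n=2))
```

Changing `n` to 2, 3, or 4 corresponds to the longer sentence beginnings covered in the following subsections.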